4 research outputs found

    Combining Unsupervised, Supervised, and Rule-based Algorithms for Text Mining of Electronic Health Records - A Clinical Decision Support System for Identifying and Classifying Allergies of Concern for Anesthesia During Surgery

    Undisclosed allergic reactions of patients are a major risk when undertaking surgeries in hospitals. We present our early experience and preliminary findings for a Clinical Decision Support System (CDSS) being developed in a Norwegian Hospital Trust. The system combines unsupervised and supervised machine learning algorithms with rule-based algorithms to identify and classify allergies of concern for anesthesia during surgery. Our approach is novel in that it uses unsupervised machine learning to analyze large corpora of narratives and automatically build a clinical language model of words and phrases whose meanings and relative meanings are also learnt. It further implements a semi-automatic annotation scheme for efficient and interactive machine learning, which largely eliminates the substantial manual annotation of clinical narratives otherwise needed to train supervised algorithms. System performance was validated by comparing allergies identified by the CDSS with a manual reference standard.
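    The validation step described above, comparing system output with a manual reference standard, can be illustrated with a minimal sketch. The function name and the allergy lists are hypothetical and are not taken from the paper; the abstract does not state which metrics were computed, so precision and recall are assumed here.

```python
def precision_recall(predicted: set, reference: set) -> tuple:
    """Compare system output with a manual reference standard.

    Precision: share of predicted items that appear in the reference.
    Recall: share of reference items that the system found.
    """
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example data, invented purely for illustration.
cdss_allergies = {"penicillin", "latex", "morphine", "egg"}
reference_allergies = {"penicillin", "latex", "morphine", "chlorhexidine"}

precision, recall = precision_recall(cdss_allergies, reference_allergies)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    In this toy case three of the four predicted allergies are in the reference list and three of the four reference allergies are found, so both values are 0.75.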

    DRIVERS AND BARRIERS TO STRUCTURING INFORMATION IN ELECTRONIC HEALTH RECORDS

    While much research exists on aspects or phenomena related to or depending on structuring of information in the healthcare context, most of this has been limited to the study of specific Electronic Health Record (EHR) implementation, or to certain capabilities or functionalities of EHRs such as decision support, the narrative, and clinical classifications and terminology. The phenomenon of information structuring in EHRs per se, has received little research attention. This article presents a review on the subject of information structuring in EHRs. While research shows that increasing structuring of health information may be favorable to healthcare, there are also caveats. This paper exposes and discusses both salient drivers and barriers described by the literature, by examining the phenomenon through seven identified themes: clinical decision support; competence; continuity of care; management; secondary uses; patient safety and quality of care; and patient empowerment. Even though increased use of structured health data (depending on context) has the potential to cause major impacts to healthcare, a middle path represented by the synergistic co-existence of both structured data and unstructured information seems to be the most feasible to follow for healthcare at the time being based on the available literature.

    Combining unsupervised, supervised and rule-based learning: the case of detecting patient allergies in electronic health records

    Background: Data mining of electronic health records (EHRs) has a huge potential for improving clinical decision support and helping healthcare deliver precision medicine. Unfortunately, the rule-based and machine learning-based approaches used for natural language processing (NLP) in healthcare today all struggle with various shortcomings related to performance, efficiency, or transparency. Methods: In this paper, we address these issues by presenting a novel method for NLP that implements unsupervised learning of word embeddings, semi-supervised learning for simplified and accelerated clinical vocabulary and concept building, and deterministic rules for fine-grained control of information extraction. The clinical language is automatically learnt, and vocabulary, concepts, and rules supporting a variety of NLP downstream tasks can further be built with only minimal manual feature engineering and tagging required from clinical experts. Together, these steps create an open processing pipeline that gradually refines the data in a transparent way, which greatly improves the interpretable nature of our method. Data transformations are thus made transparent and predictions interpretable, which is imperative for healthcare. The combined method also has other advantages, like potentially being language independent, demanding few domain resources for maintenance, and being able to cover misspellings, abbreviations, and acronyms. To test and evaluate the combined method, we have developed a clinical decision support system (CDSS) named Information System for Clinical Concept Searching (ICCS) that implements the method for clinical concept tagging, extraction, and classification. Results: In empirical studies the method shows high performance (recall 92.6%, precision 88.8%, F-measure 90.7%), and has demonstrated its value to clinical practice. Here we employ a real-life EHR-derived dataset to evaluate the method's performance on the task of classification (i.e., detecting patient allergies) against a range of common supervised learning algorithms. The combined method achieves state-of-the-art performance compared to the alternative methods we evaluate. We also perform a qualitative analysis of common word embedding methods on the task of word similarity to examine their potential for supporting automatic feature engineering for clinical NLP tasks. Conclusions: Based on the promising results, we suggest more research should be aimed at exploiting the inherent synergies between unsupervised, supervised, and rule-based paradigms for clinical NLP.
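    The reported figures can be cross-checked with a short calculation. The sketch below assumes the F-measure is the balanced F1 score, i.e. the harmonic mean of precision and recall; the abstract does not state this explicitly, but the reported numbers are consistent with it. The function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
reported_precision = 0.888
reported_recall = 0.926

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.1%}")  # prints "F-measure: 90.7%"
```

    The result, 2 × 0.888 × 0.926 / (0.888 + 0.926) ≈ 0.9066, rounds to the 90.7% given in the abstract.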